BB-Graph: A New Subgraph Isomorphism Algorithm for Efficiently Querying Big Graph Databases

Authors

  • Merve Asiler
  • Adnan Yazici
Abstract

With the emergence of the big data concept, the big graph database model has become very popular, since it provides strong modeling for complex applications and fast querying, especially for cases that require costly join operations in RDBMSs. However, finding all exact matches of a query graph in a big graph database, known as the subgraph isomorphism problem, is a major challenge. Although a number of related studies exist in the literature, there is still a need for an algorithm that works efficiently for all types of queries, since the subgraph isomorphism problem is NP-hard. Current subgraph isomorphism approaches build on Ullmann's idea of pruning out irrelevant candidates. Nevertheless, for some graph databases, the existing pruning techniques are not adequate to handle complex queries. Moreover, many existing algorithms need large indices that cause extra memory consumption. Motivated by these issues, we introduce a new subgraph isomorphism algorithm, BB-Graph, for querying big graph databases efficiently without requiring a large data structure to be stored in main memory. We test and compare the proposed BB-Graph algorithm with two popular existing ones, GraphQL and Cypher of Neo4j. Our experiments are run on a very big graph database application (Population Database) and on the publicly available World Cup graph database application. We show that, for most query types, our algorithm performs better than the two algorithms used for comparison in this study.
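To make the problem statement concrete, the following is a minimal Python sketch of generic prune-and-backtrack subgraph matching in the spirit of the Ullmann-style approaches the abstract refers to. It is not the BB-Graph algorithm, whose details are not given here. The graph representation (adjacency sets plus a label dictionary), the function name `find_matches`, and the label/degree pruning rule are all assumptions chosen for illustration.

```python
from typing import Dict, Hashable, Iterator, Set

# Assumed representation: undirected graph as node -> set of neighbor nodes.
Graph = Dict[Hashable, Set[Hashable]]


def find_matches(query: Graph, data: Graph,
                 q_labels: Dict[Hashable, str],
                 d_labels: Dict[Hashable, str]) -> Iterator[Dict[Hashable, Hashable]]:
    """Yield every injective mapping from query nodes to data nodes
    that preserves node labels and query edges."""
    # Pruning step: a data node is a candidate for a query node only if
    # the labels agree and it has at least as many neighbors.
    candidates = {
        u: [v for v in data
            if d_labels[v] == q_labels[u] and len(data[v]) >= len(query[u])]
        for u in query
    }
    # Match the query nodes with the fewest candidates first.
    order = sorted(query, key=lambda u: len(candidates[u]))

    def extend(i, mapping, used):
        if i == len(order):
            yield dict(mapping)  # copy, because mapping is mutated below
            return
        u = order[i]
        for v in candidates[u]:
            if v in used:
                continue  # keep the mapping injective
            # Every already-matched neighbor of u must map to a neighbor of v.
            if all(mapping[w] in data[v] for w in query[u] if w in mapping):
                mapping[u] = v
                used.add(v)
                yield from extend(i + 1, mapping, used)
                del mapping[u]  # backtrack
                used.discard(v)

    yield from extend(0, {}, set())


# Hypothetical toy data: three "Person" nodes in a triangle, one "City" node.
data = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
d_labels = {"a": "Person", "b": "Person", "c": "Person", "d": "City"}

# Query: a Person connected to a City.
query = {"x": {"y"}, "y": {"x"}}
q_labels = {"x": "Person", "y": "City"}

matches = list(find_matches(query, data, q_labels, d_labels))
```

In this toy example the only match maps `x` to `c` and `y` to `d`. The worst-case running time is still exponential in the query size, which is consistent with the NP-hardness noted in the abstract; pruning only reduces the number of candidates explored in practice.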


Related articles

Subgraph Isomorphism Search in Massive Graph Databases

Subgraph isomorphism search is a basic task in querying graph data. It consists of finding all embeddings of a query graph in a data graph. It is encountered in many real-world applications that require the management of structural data, such as bioinformatics and chemistry. However, subgraph isomorphism search is an NP-complete problem which is prohibitively expensive in both memory and time in mas...


Efficient algorithms for supergraph query processing on graph databases

We study the problem of processing supergraph queries on graph databases. A graph database D is a large set of graphs. A supergraph query q on D retrieves all the graphs in D such that q is a supergraph of them. The large number of graphs in databases and the NP-completeness of subgraph isomorphism testing make it challenging to process supergraph queries efficiently. In this paper, a n...


A Simple Algorithm for Subgraph Queries in Big Graphs

Subgraph querying, also known as subgraph isomorphism search, is a fundamental problem in querying graph-like structured data. It consists of enumerating the subgraphs of a data graph that match a query graph. This problem arises in many real-world applications related to query processing or pattern recognition, such as computer vision, social network analysis, bioinformatics and big data analytics. Su...


Topological Queries on Graph-structured XML Data: Models and Implementations

In many applications, data has a graph structure, which can be naturally represented as graph-structured XML. Existing queries defined on tree-structured and graph-structured XML data mainly focus on subgraph matching, which cannot cover all the requirements of querying graphs. In this paper, a new kind of query, the topological query on graph-structured XML, is presented. This kind of queries ...


GiS: Fast Indexing and Querying of Graph Structures

We propose a new way of indexing a large database of graphs and processing exact subgraph matching (or subgraph isomorphism) and approximate (full) graph matching queries. Rather than decomposing a graph into smaller units (e.g., paths, trees, graphs) for indexing purposes, we represent each graph in the database by its graph signature, which is essentially a multiset, and each signature is the...



Journal title:
  • CoRR

Volume abs/1706.06654  Issue 

Pages  -

Publication date 2017